57 research outputs found
Network Resource Allocation via Stochastic Subgradient Descent: Convergence Rate
This paper considers a general stochastic resource allocation problem that
arises widely in wireless networks, cognitive radio networks, smart-grid
communications, and cross-layer design. The problem formulation involves
expectations with respect to a collection of random variables with unknown
distributions, representing exogenous quantities such as channel gain, user
density, or spectrum occupancy. We consider the constant step-size stochastic
dual subgradient descent (SDSD) method that has been widely used for online
resource allocation in networks. The problem is solved in the dual domain, which
results in a primal resource allocation subproblem at each time instant. The
goal here is to characterize the non-asymptotic behavior of such stochastic
resource allocations in an almost sure sense.
It is well known that with a constant step size $\epsilon$, SDSD converges to an
$\mathcal{O}(\epsilon)$-sized neighborhood of the optimum. In practice, however,
there exists a trade-off between the rate of convergence and the choice of
$\epsilon$. This paper establishes a convergence rate result for the SDSD
algorithm that precisely characterizes this trade-off. Towards this end, a
novel stochastic bound on the gap between the objective function and the
optimum is developed. The asymptotic behavior of the stochastic term is
characterized in an almost sure sense, thereby generalizing the existing
results for stochastic subgradient methods. For the stochastic resource
allocation problem at hand, the result explicates the rate with which the
allocated resources become near-optimal. As an application, the power and
user-allocation problem in device-to-device networks is formulated and solved
using the SDSD algorithm. Further intuition on the rate results is obtained
from the verification of the regularity conditions and accompanying simulation
results.
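The primal-dual structure described above can be illustrated with a toy sketch. The utility, channel model, budget, and all constants below are invented for illustration and are not the paper's formulation; the sketch only shows the SDSD pattern of a per-slot primal allocation subproblem followed by a constant step-size dual subgradient update.

```python
import random

random.seed(0)

# Toy SDSD sketch: maximize E[h_t * log(1 + p_t)] subject to the average-power
# constraint E[p_t] <= P_MAX, with random channel gain h_t of unknown
# distribution. The dual variable lam prices the constraint.

P_MAX = 1.0     # average power budget (invented)
EPS = 0.01      # constant step size: smaller -> tighter neighborhood, slower
P_CAP = 5.0     # instantaneous power cap (invented)
lam, avg_power = 0.0, 0.0

for t in range(1, 20001):
    h = random.uniform(0.5, 2.0)                 # exogenous random channel gain
    # Primal subproblem: argmax_p h*log(1+p) - lam*p  =>  p = h/lam - 1
    p = min(max(h / lam - 1.0, 0.0), P_CAP) if lam > 1e-9 else P_CAP
    lam = max(lam + EPS * (p - P_MAX), 0.0)      # dual subgradient step
    avg_power += (p - avg_power) / t             # running average allocation

print(f"long-run average power {avg_power:.3f} vs budget {P_MAX}")
```

The long-run average allocation hovers near the budget, with a residual gap controlled by the step size, which is the trade-off the abstract quantifies.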
Tracking Moving Agents via Inexact Online Gradient Descent Algorithm
Multi-agent systems are being increasingly deployed in challenging
environments for performing complex tasks such as multi-target tracking,
search-and-rescue, and intrusion detection. Notwithstanding the computational
limitations of individual robots, such systems rely on collaboration to sense
and react to the environment. This paper formulates the generic target tracking
problem as a time-varying optimization problem and puts forth an inexact online
gradient descent method for solving it sequentially. The performance of the
proposed algorithm is studied by characterizing its dynamic regret, a notion
common to the online learning literature. Building upon the existing results,
we provide improved regret rates that not only allow non-strongly convex costs
but also explicate the role of the cumulative gradient error. Two distinct
classes of problems are considered: one in which the objective function adheres
to a quadratic growth condition, and another where the objective function is
convex but the variable belongs to a compact domain. For both cases, results
are developed while allowing the error to be either adversarial or arising from
a white noise process. Further, the generality of the proposed framework is
demonstrated by developing online variants of existing stochastic gradient
algorithms and interpreting them as special cases of the proposed inexact
gradient method. The efficacy of the proposed inexact gradient framework is
established on a multi-agent multi-target tracking problem, while its
flexibility is exemplified by generating online movie recommendations for
the MovieLens dataset.
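A minimal sketch of the inexact online gradient descent idea for tracking is given below. The quadratic cost, drift rate, and noise level are invented for illustration: the target moves each slot (so the comparator's path length grows), and the algorithm only sees a noise-corrupted gradient, which is the "inexact" part whose cumulative error enters the regret bounds.

```python
import random

random.seed(1)

# Track the minimizer of f_t(x) = 0.5*(x - theta_t)^2 with drifting theta_t,
# using one inexact (noisy) gradient step per time slot.

STEP = 0.5
x, theta, cum_err = 0.0, 0.0, 0.0
T = 5000

for t in range(T):
    theta += 0.001                                     # target drifts slowly
    noisy_grad = (x - theta) + random.gauss(0, 0.05)   # inexact gradient
    x -= STEP * noisy_grad                             # online gradient step
    cum_err += abs(x - theta)                          # proxy for dynamic regret

print(f"average tracking error {cum_err / T:.4f}")
```

The average tracking error settles at a level set by the drift (path length) and the gradient noise, mirroring the two terms in the dynamic regret bounds.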
Asynchronous Decentralized Stochastic Optimization in Heterogeneous Networks
We consider expected risk minimization in multi-agent systems composed of
distinct subsets of agents operating without a common time-scale. Each
individual in the network is charged with minimizing the global objective
function, which is the sum of the statistical average loss functions of the
agents in the network. Since agents are not assumed to observe data from
identical distributions, the standard hypothesis that all agents seek a common
action, upon which consensus constraints are formulated, is violated. Thus, we
consider nonlinear network proximity constraints that incentivize nearby nodes
to make decisions that are close to one another but need not coincide.
Moreover, agents are not assumed to
receive their sequentially arriving observations on a common time index, and
thus seek to learn in an asynchronous manner. An asynchronous stochastic
variant of the Arrow-Hurwicz saddle point method is proposed to solve this
problem which operates by alternating primal stochastic descent steps and
Lagrange multiplier updates which penalize the discrepancies between agents.
This tool leads to an implementation that allows for each agent to operate
asynchronously with local information only and message passing with neighbors.
Our main result establishes that the proposed method converges in expectation,
both in terms of primal sub-optimality and constraint violation, to
neighborhoods of the optimum whose radii depend on the chosen step size.
Empirical evaluation on an asynchronously operating wireless network that
manages user channel interference through an adaptive communications pricing
mechanism demonstrates that our theoretical results translate well to practice.
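The alternating primal-dual updates can be sketched on a two-agent toy problem. Everything here is invented for illustration (quadratic local losses, a single proximity constraint, random wake-ups standing in for asynchrony); it only shows the Arrow-Hurwicz pattern of stochastic primal descent plus multiplier ascent on the proximity discrepancy.

```python
import random

random.seed(2)

# Two agents with distinct data distributions (targets 1.0 and 2.0) minimize
# local quadratic losses under the proximity constraint (x1 - x2)^2 <= GAMMA.
# Each agent wakes up at random times (no common clock); mu is the shared
# Lagrange multiplier penalizing the discrepancy.

GAMMA = 0.25
STEP = 0.02
x = [0.0, 0.0]
mu = 0.0
targets = [1.0, 2.0]

for t in range(40000):
    for i in (0, 1):
        if random.random() < 0.5:                         # asynchronous wake-up
            sample = targets[i] + random.gauss(0.0, 0.1)  # noisy local datum
            grad = (x[i] - sample) + mu * 2.0 * (x[i] - x[1 - i])
            x[i] -= STEP * grad                           # primal descent step
    mu = max(mu + STEP * ((x[0] - x[1]) ** 2 - GAMMA), 0.0)  # dual ascent step

print(f"x1={x[0]:.2f}, x2={x[1]:.2f}, squared gap={(x[0] - x[1]) ** 2:.3f}")
```

The agents settle at decisions pulled toward each other just enough to respect the proximity budget, rather than being forced to a common consensus value.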
Online Learning over Dynamic Graphs via Distributed Proximal Gradient Algorithm
We consider the problem of tracking the minimum of a time-varying convex
optimization problem over a dynamic graph. Motivated by target tracking and
parameter estimation problems in intermittently connected robotic and sensor
networks, the goal is to design a distributed algorithm capable of handling
non-differentiable regularization penalties. The proposed proximal online
gradient descent algorithm is built to run in a fully decentralized manner and
utilizes consensus updates over possibly disconnected graphs. The performance
of the proposed algorithm is analyzed by developing bounds on its dynamic
regret in terms of the cumulative path length of the time-varying optimum. It
is shown that, as compared to the centralized case, the dynamic regret incurred
by the proposed algorithm over $T$ time slots is only modestly worse, despite
the disconnected and time-varying network topology. The empirical performance
of the proposed algorithm is tested on the distributed dynamic sparse recovery
problem, where it is shown to incur a dynamic regret that is close to that of
the centralized algorithm.
Nonparametric Compositional Stochastic Optimization for Risk-Sensitive Kernel Learning
In this work, we address optimization problems where the objective function
is a nonlinear function of an expected value, i.e., compositional stochastic
strongly convex programs. We consider the case where the decision variable is
not vector-valued but instead belongs to a reproducing kernel Hilbert space
(RKHS), motivated by risk-aware formulations of supervised learning and Markov
Decision Processes defined over continuous spaces.
We develop the first memory-efficient stochastic algorithm for this setting,
which we call Compositional Online Learning with Kernels (COLK). COLK, at its
core a two-time-scale stochastic approximation method, addresses the fact that
(i) compositions of expected value problems cannot be addressed by classical
stochastic gradient due to the presence of the inner expectation; and (ii) the
RKHS-induced parameterization has complexity proportional to the iteration
index, a growth that is mitigated through greedily constructed subspace
projections. We establish almost sure convergence of COLK with attenuating
step-sizes, and linear convergence in mean to a neighborhood with constant
step-sizes, as well as the fact that its complexity is at-worst finite. The
experiments with robust formulations of supervised learning demonstrate that
COLK reliably converges, attains consistent performance across training runs,
and thus overcomes overfitting.
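The two-time-scale core of such compositional methods can be shown on a scalar toy problem (a scalar parameter stands in for the RKHS function; the distributions and step sizes are invented). A classical stochastic gradient fails here because the objective is a nonlinear function of an expectation, so an auxiliary variable tracks the inner expectation on a faster time scale.

```python
import random

random.seed(4)

# Stochastic compositional gradient sketch for F(w) = ( E[a*w - b] )^2 with
# a ~ {1, 2}, b ~ {0, 2}. Outer function f(u) = u^2, inner g(w) = E[a*w - b].
# u tracks the inner expectation on the fast time scale; w moves slowly.

ALPHA, BETA = 0.01, 0.1   # slow (primal) and fast (tracking) step sizes
w, u = 0.0, 0.0
for t in range(20000):
    a = random.choice([1.0, 2.0])
    b = random.choice([0.0, 2.0])
    u = (1 - BETA) * u + BETA * (a * w - b)   # fast: track E[a*w - b]
    w -= ALPHA * 2.0 * u * a                  # slow: f'(u) * sample of g'(w)

print(f"w = {w:.3f}  (minimizer of (1.5*w - 1)^2 is 2/3)")
```

Since E[a] = 1.5 and E[b] = 1, the iterate converges to a neighborhood of the minimizer 2/3, illustrating why the inner expectation must be estimated separately rather than sampled inside the outer gradient.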
Conservative Stochastic Optimization with Expectation Constraints
This paper considers stochastic convex optimization problems where the
objective and constraint functions involve expectations with respect to the
data indices or environmental variables, in addition to deterministic convex
constraints on the domain of the variables. Although the setting is generic and
arises in different machine learning applications, online and efficient
approaches for solving such problems have not been widely studied. Since the
underlying data distribution is unknown a priori, a closed-form solution is
generally not available, and classical deterministic optimization paradigms are
not applicable. State-of-the-art approaches, such as those using the saddle
point framework, can ensure that the optimality gap as well as the constraint
violation decay as $\mathcal{O}\left(T^{-\frac{1}{2}}\right)$, where $T$ is the
number of stochastic gradients. The domain constraints are assumed simple and
handled via
projection at every iteration. In this work, we propose a novel conservative
stochastic optimization algorithm (CSOA) that achieves zero constraint
violation and an $\mathcal{O}\left(T^{-\frac{1}{2}}\right)$ optimality gap.
Further, for scenarios in which computing the projection is expensive, the
projection operation in the proposed algorithm can be avoided by considering
the conditional gradient or Frank-Wolfe (FW) variant of the algorithm. The
state-of-the-art stochastic FW variants achieve an optimality gap of
$\mathcal{O}\left(T^{-\frac{1}{3}}\right)$ after $T$ iterations, though these algorithms
have not been applied to problems with functional expectation constraints. In
this work, we propose the FW-CSOA algorithm that is not only projection-free
but also achieves zero constraint violation, with an
$\mathcal{O}\left(T^{-\frac{1}{4}}\right)$ decay of the optimality gap. The
efficacy of the proposed algorithms is tested on two relevant problems: fair
classification and structured matrix completion.
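The conservative idea, tightening the expectation constraint by a small vanishing margin so the averaged iterate ends up feasible, can be sketched on a toy problem. The objective, constraint, margin schedule, and step size below are invented for illustration and are not the paper's CSOA parameters.

```python
import random

random.seed(5)

# min E[(x - 2)^2] s.t. E[a*x] <= 1, with a ~ Uniform(0.8, 1.2), so the
# unconstrained optimum x = 2 is infeasible. A primal-dual scheme enforces the
# conservatively tightened constraint E[a*x] <= 1 - MARGIN.

T = 20000
STEP = 0.02
MARGIN = 2.0 / T ** 0.5       # conservative tightening, vanishing with T
c = 2.0
x, lam, xbar = 0.0, 0.0, 0.0

for t in range(1, T + 1):
    a = random.uniform(0.8, 1.2)
    gx = 2.0 * (x - c) + lam * a                         # stochastic gradient
    x -= STEP * gx                                       # primal step
    lam = max(lam + STEP * (a * x - 1.0 + MARGIN), 0.0)  # tightened dual step
    xbar += (x - xbar) / t                               # averaged iterate

print(f"averaged iterate xbar = {xbar:.3f} (constraint: E[a]*x <= 1)")
```

Because E[a] = 1, the averaged iterate lands slightly inside the constraint boundary rather than oscillating across it, which is the mechanism behind the zero-violation guarantee.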
Escaping Saddle Points with the Successive Convex Approximation Algorithm
Optimizing non-convex functions is of primary importance in the vast majority
of machine learning algorithms. Even though many gradient descent based
algorithms have been studied, successive convex approximation based algorithms
have been recently empirically shown to converge faster. However, such
successive convex approximation based algorithms can get stuck in a first-order
stationary point. To avoid that, we propose an algorithm that perturbs the
optimization variable slightly at the appropriate iteration. In addition to
achieving the same convergence rate results as the non-perturbed version, we
show that the proposed algorithm converges to a second order stationary point.
Thus, the proposed algorithm escapes saddle points efficiently and does not get
stuck at first-order saddle points.
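The perturbation mechanism is shown below with plain gradient descent standing in for the successive convex approximation updates (the test function, noise scale, and stall test are invented). An iterate started exactly on a saddle's stable manifold converges to the saddle; a small random perturbation applied when progress stalls lets it escape to a minimum.

```python
import random

random.seed(6)

# f(x, y) = x^4/4 - x^2/2 + y^2/2 has a saddle at (0, 0) and minima at
# (+/-1, 0). Starting from (0, 0.5), the x-gradient is identically zero, so
# unperturbed descent converges to the saddle.

def grad(x, y):
    return x ** 3 - x, y

STEP = 0.1
x, y = 0.0, 0.5                 # exactly on the saddle's stable manifold
for t in range(500):
    gx, gy = grad(x, y)
    if gx * gx + gy * gy < 1e-12:        # stalled near a stationary point
        x += random.gauss(0, 0.01)       # perturb the variable slightly
        y += random.gauss(0, 0.01)
        gx, gy = grad(x, y)
    x -= STEP * gx
    y -= STEP * gy

print(f"converged near ({x:.2f}, {y:.2f})")
```

The iterate ends near one of the minima (+/-1, 0) instead of the saddle, illustrating convergence to a second-order stationary point.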
Adaptive Kernel Learning in Heterogeneous Networks
We consider learning in decentralized heterogeneous networks: agents seek to
minimize a convex functional that aggregates data across the network, while
only having access to their local data streams. We focus on the case where
agents seek to estimate a regression \emph{function} that belongs to a
reproducing kernel Hilbert space (RKHS). To incentivize coordination while
respecting network heterogeneity, we impose nonlinear proximity constraints. To
solve the constrained stochastic program, we propose applying a functional
variant of the stochastic primal-dual (Arrow-Hurwicz) method, which yields a
decentralized algorithm. To handle the fact that agents' functions have
complexity proportional to time (owing to the RKHS parameterization), we
project the primal iterates onto subspaces greedily constructed from kernel
evaluations of agents' local observations. The resulting scheme, dubbed
Heterogeneous Adaptive Learning with Kernels (HALK), when used with constant
step-sizes, attenuates the sub-optimality to a step-size-dependent neighborhood
and exactly satisfies the constraints in the long run, improving upon
state-of-the-art rates for vector-valued problems.
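The idea of keeping the kernel expansion finite via a greedy novelty test can be sketched for a single agent. This is a crude stand-in for the subspace projections used by HALK (the kernel bandwidth, novelty threshold, and merge-into-nearest-atom rule below are invented, in the spirit of quantized kernel LMS rather than the paper's projection).

```python
import math
import random

random.seed(7)

# Online kernel regression of y = sin(x) + noise: a new sample extends the
# dictionary only if it is far from all existing atoms; otherwise it updates
# the nearest atom's weight, so model complexity stays bounded.

def k(a, b, bw=0.3):
    return math.exp(-(a - b) ** 2 / (2 * bw * bw))

STEP, NOVELTY = 0.3, 0.5
dict_pts, weights = [], []

def predict(z):
    return sum(w * k(p, z) for p, w in zip(dict_pts, weights))

for t in range(3000):
    x = random.uniform(-2, 2)
    y = math.sin(x) + random.gauss(0, 0.05)
    err = y - predict(x)
    if not dict_pts or min(abs(x - p) for p in dict_pts) > NOVELTY:
        dict_pts.append(x)             # novel sample: extend the dictionary
        weights.append(STEP * err)
    else:                              # otherwise: update the nearest atom
        j = min(range(len(dict_pts)), key=lambda i: abs(x - dict_pts[i]))
        weights[j] += STEP * err

print(f"dictionary size {len(dict_pts)}, prediction at 1.0: {predict(1.0):.2f}")
```

The dictionary saturates at a size set by the novelty threshold rather than growing with the iteration index, while the fit remains close to the target function.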
Online Learning with Inexact Proximal Online Gradient Descent Algorithms
We consider non-differentiable dynamic optimization problems such as those
arising in robotics and subspace tracking. Given the computational constraints
and the time-varying nature of the problem, a low-complexity algorithm is
desirable, while the accuracy of the solution may only increase slowly over
time. We put forth the proximal online gradient descent (OGD) algorithm for
tracking the optimum of a composite objective function comprising a
differentiable loss function and a non-differentiable regularizer. An online
learning framework is considered and the gradient of the loss function is
allowed to be erroneous. Both the gradient error and the dynamics of the
function optimum (the target) are adversarial, and the performance of the
inexact proximal OGD is characterized via its dynamic regret, expressed in
terms of the cumulative gradient error and the path length of the target. The
proposed
inexact proximal OGD is generalized for application to large-scale problems
where the loss function has a finite sum structure. In such cases, evaluation
of the full gradient may not be viable and a variance reduced version is
proposed that allows the component functions to be sub-sampled. The efficacy of
the proposed algorithms is tested on the problem of formation control in
robotics and on the dynamic foreground-background separation problem in video.
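The finite-sum variant can be sketched as follows. The component losses, drift, and mini-batch size are invented for illustration: sub-sampling the components makes the gradient inexact, and the non-differentiable l1 regularizer is handled through its prox (soft-thresholding), matching the composite structure described above.

```python
import random

random.seed(8)

# Composite, time-varying objective:
#   f_t(x) = (1/N) * sum_i 0.5*(x - theta_t - d_i)^2 + LAM*|x|
# Only a mini-batch of component gradients is evaluated per step.

def prox_l1(v, thr):
    """Prox of thr*|.| (soft-thresholding)."""
    return max(abs(v) - thr, 0.0) * (1.0 if v > 0 else -1.0)

N, BATCH, STEP, LAM = 100, 5, 0.5, 0.05
offsets = [random.gauss(0, 0.3) for _ in range(N)]   # finite-sum components
x, theta = 0.0, 1.0

for t in range(3000):
    theta += 0.0003                                  # drifting optimum
    batch = random.sample(range(N), BATCH)           # sub-sampled components
    g = sum(x - theta - offsets[i] for i in batch) / BATCH  # inexact gradient
    x = prox_l1(x - STEP * g, STEP * LAM)            # proximal OGD step

print(f"x = {x:.3f}, drifted target mean = {theta + sum(offsets) / N:.3f}")
```

The iterate tracks the drifting minimizer (shifted slightly toward zero by the l1 prox) despite never seeing the full gradient, with the sub-sampling error entering the regret exactly like any other gradient error.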
Sublinear Regret and Belief Complexity in Gaussian Process Bandits via Information Thresholding
Bayesian optimization is a framework for global search via maximum a
posteriori updates rather than simulated annealing, and has gained prominence
for decision-making under uncertainty. In this work, we cast Bayesian
optimization as a multi-armed bandit problem, where the payoff function is
sampled from a Gaussian process (GP). Further, we focus on action selections
via upper confidence bound (UCB) or expected improvement (EI) due to their
prevalent use in practice. Prior works using GPs for bandits cannot allow the
iteration horizon to be large, as the complexity of computing the posterior
parameters scales cubically with the number of past observations. To circumvent
this computational burden, we propose a simple statistical test: only
incorporate an action into the GP posterior when its conditional entropy
exceeds a threshold. Doing so permits us to derive sublinear regret bounds for
GP bandit algorithms, up to factors depending on the compression parameter, for
both discrete and continuous action sets. Moreover,
the complexity of the GP posterior remains provably finite and depends on the
Shannon capacity of the observation space. Experimentally, we observe
state-of-the-art accuracy and complexity trade-offs for GP bandit algorithms
applied to global optimization, suggesting the merits of compressed GPs in
bandit settings.
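The compression test can be sketched for GP-UCB on a small discrete arm set. The payoff function, kernel bandwidth, UCB weight, and the use of posterior variance as a stand-in for the conditional-entropy test are all invented for illustration; the point is only that an observation enters the posterior just when the posterior is still uncertain at that arm, so the kernel matrix stays small.

```python
import math
import random

import numpy as np

random.seed(9)

def kern(a, b, bw=0.3):
    return math.exp(-(a - b) ** 2 / (2 * bw * bw))

arms = [i / 20 for i in range(21)]        # discrete action set in [0, 1]
payoff = lambda z: -(z - 0.7) ** 2        # unknown function, maximum at 0.7
X, y = [], []                             # compressed observation set
NOISE, BETA, THRESH = 0.05, 2.0, 0.01

def posterior(z):
    """GP posterior mean and variance at z from the compressed set."""
    if not X:
        return 0.0, 1.0
    K = np.array([[kern(a, b) for b in X] for a in X]) + NOISE * np.eye(len(X))
    kz = np.array([kern(a, z) for a in X])
    mu = kz @ np.linalg.solve(K, np.array(y))
    var = kern(z, z) - kz @ np.linalg.solve(K, kz)
    return mu, max(var, 0.0)

for t in range(60):
    scores = []
    for a in arms:                        # UCB action selection
        m, v = posterior(a)
        scores.append((m + BETA * math.sqrt(v), a))
    _, a = max(scores)
    r = payoff(a) + random.gauss(0, 0.05)
    if posterior(a)[1] > THRESH:          # keep only sufficiently novel points
        X.append(a)
        y.append(r)

best_arm = max(arms, key=lambda a: posterior(a)[0])
print(f"posterior-mean argmax {best_arm:.2f}, compressed set size {len(X)}")
```

The posterior concentrates near the true maximizer while retaining far fewer points than the number of rounds, which is the accuracy/complexity trade-off the abstract reports.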
- …